ppiTrim: constructing non-redundant and up-to-date interactomes
Abstract
Robust advances in interactome analysis demand comprehensive, non-redundant and consistently annotated data sets. By non-redundant, we mean that the accounting of evidence for every interaction should be faithful: each independent experimental support is counted exactly once, no more, no less. While many interactions are shared among public repositories, none of them contains the complete known interactome for any model organism. In addition, the annotations of the same experimental result by different repositories often disagree. This raises the question of which annotation to keep when consolidating identical pieces of evidence. The iRefIndex database, which includes interactions from most popular repositories with a standardized protein nomenclature, represents a significant advance in all aspects, especially in comprehensiveness. However, iRefIndex aims to maintain all information and annotation from the original sources, and it requires users to perform additional processing to fully achieve the aforementioned goals. Another issue concerns protein complexes. Some databases represent experimentally observed complexes as interactions with more than two participants, while others expand them into binary interactions using the spoke or matrix model. To avoid accumulating untested interaction information, it is preferable to replace the expanded protein complexes, whether from the spoke or the matrix model, with a flat list of complex members. To address these issues and to achieve our goals, we have developed ppiTrim, a script that processes iRefIndex to produce non-redundant, consistently annotated data sets of physical interactions. Our script proceeds in three stages: mapping all interactants to gene identifiers and removing all undesired raw interactions, deflating potentially expanded complexes, and reconciling, for each interaction, the annotation labels among different source databases.
As an illustration, we have processed the three largest organismal data sets: yeast, human and fruit fly. While ppiTrim can resolve most apparent conflicts between different labelings, we also discovered some unresolvable disagreements, mostly resulting from different annotation policies among repositories. Database URL: http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads/ppiTrim.html.
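The two consolidation ideas in the abstract, counting each experimental support once and replacing spoke-expanded complexes with a flat member list, can be illustrated with a minimal Python sketch. This is not the ppiTrim implementation: the record layout, the database names `dbA` and `dbB`, the gene labels, and the grouping keys (publication identifier plus bait) are all hypothetical simplifications chosen for illustration.

```python
from collections import defaultdict

# Hypothetical, simplified records (not the iRefIndex format):
# (source_db, pubmed_id, bait, prey, is_spoke_expanded)
records = [
    ("dbA", 101, "G1", "G2", False),
    ("dbB", 101, "G2", "G1", False),  # same experiment, reported by a second database
    ("dbA", 202, "G3", "G4", True),   # spoke expansion of one observed complex
    ("dbA", 202, "G3", "G5", True),
    ("dbA", 202, "G3", "G6", True),
]

def consolidate(records):
    """Group duplicate binary evidence and deflate spoke-expanded complexes."""
    # (pubmed_id, unordered gene pair) -> databases reporting that evidence
    binary = defaultdict(set)
    # (pubmed_id, bait) -> flat set of complex members
    complexes = defaultdict(set)
    for source, pmid, bait, prey, expanded in records:
        if expanded:
            # Assumption: pairs sharing a publication and bait came from one complex.
            complexes[(pmid, bait)].update((bait, prey))
        else:
            # frozenset ignores participant order, so A-B and B-A are one evidence.
            binary[(pmid, frozenset((bait, prey)))].add(source)
    return binary, complexes

binary, complexes = consolidate(records)
for (pmid, pair), sources in sorted(binary.items(), key=lambda kv: kv[0][0]):
    print(pmid, sorted(pair), "reported by", sorted(sources))
for (pmid, bait), members in sorted(complexes.items()):
    print(pmid, "complex members:", sorted(members))
```

In this toy input, the two reports of publication 101 collapse into a single piece of evidence with two sources, and the three spoke pairs from publication 202 become one four-member complex. A real pipeline would likely need further grouping criteria, such as the experimental detection method, before treating records as the same evidence.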
Volume: 2011
Publication date: 2011